A classical analogue of negative information 
Recently, it was discovered that the quantum partial information needed to merge one party's state 
with another party's state is given by the conditional entropy, which can be negative [Horodecki, 
Oppenheim, and Winter, Nature 436, 673 (2005)]. Here we find a classical analogue of this, based 
on a long known relationship between entanglement and shared private correlations: namely, we 
consider a private distribution held between two parties, and correlated to a reference system, and 
ask how much secret communication is needed for one party to send her distribution to the other. We 
give optimal protocols for this task, and find that private information can be negative - the sender's 
distribution can be transferred and the potential to send future distributions in secret is gained 
through the distillation of a secret key. An analogue of quantum state exchange is also discussed 
and one finds cases where exchanging a distribution costs less than for one party to send it. The 
results give new classical protocols, and also clarify the various relationships between entanglement 
and privacy. 
Introduction. While evaluating the quality of information is difficult, we can quantify it.
This was first done by Shannon [1], who showed that the amount of information of a random
variable $X$ is given by the Shannon entropy $H(X) = -\sum_x P_X(x) \log_2 P_X(x)$, where
$P_X(x)$ is the probability that the source produces $X = x$ from distribution $P_X$. If $n$
is the length of the message (of independent samples of $X$) we want to communicate to a
friend, then $\approx nH(X)$ is the number of bits required to send them. If our friend
already has some prior information about the message we are going to send him (in the form
of the random variable $Y$), then the number of bits we need to send him is less, and is
given by $n$ times the conditional entropy $H(X|Y) = H(XY) - H(Y)$, according to the
Slepian-Wolf theorem [2].
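As a small numerical illustration of these two quantities, the following sketch computes $H(X)$ and $H(X|Y) = H(XY) - H(Y)$ for a toy source. The distribution `p_xy` and the helper names `entropy` and `marginal` are assumptions made for this example only; they do not come from the paper.

```python
from collections import defaultdict
from math import log2


def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(joint, keep):
    """Marginalise a joint distribution over tuples onto the coordinates in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


# Hypothetical source: X is a fair bit and Y agrees with X with probability 0.9.
p_xy = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

h_x = entropy(marginal(p_xy, [0]))                           # bits per sample without side information
h_x_given_y = entropy(p_xy) - entropy(marginal(p_xy, [1]))   # bits per sample when the receiver holds Y

print(f"H(X)   = {h_x:.4f}")
print(f"H(X|Y) = {h_x_given_y:.4f}")
```

For this toy source the side information roughly halves the number of bits Alice has to send per sample.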

In the case of quantum information, it was shown by Schumacher [3] that for a source
producing a string of $n$ unknown quantum states with density matrix $\rho_A$,
$\approx nS(A)$ quantum bits (qubits) are necessary and sufficient to send the states, where
$S(A) = -\mathrm{Tr}\,\rho_A \log \rho_A$ is the von Neumann entropy (we drop the explicit
dependence on $\rho$ in $S(A)$). One can now ask how many qubits are needed to send the
states if the receiver has some prior information. More precisely, if two parties, Alice and
Bob, possess shares $A$ and $B$ of a bipartite system $AB$ described by the quantum state
$\rho_{AB}$, how many qubits does Alice need to send Bob so that he can locally prepare a
bipartite system $A'B$ described by the same quantum state (classical communication is free
in this model)? We say that Bob has some prior information in the form of the state
$\rho_B = \mathrm{Tr}_A\,\rho_{AB}$, and Alice wants to merge her state with his by sending
him some partial quantum information.

Recently, it was found that a rate of $S(A|B) = S(AB) - S(B)$ qubits is necessary and
sufficient for this task. More mathematically: just as in Schumacher's quantum source coding
[3], we consider a source emitting a sequence of $n$ unknown states, but the statistics of
the source, i.e. the average density matrix of the states, is known. The ensemble of states
which realize the density matrix is however unspecified. We then demand that the protocol
allows Alice to transfer her share of the state to Bob with high probability for all
possible states from the ensemble. A more compact way to say this is to imagine that the
state which Alice and Bob share is part of some pure state shared with a reference system
$R$ and given by $|\psi\rangle_{ABR}$ such that $\rho_{AB}$ is obtained by tracing over the
reference system. A successful protocol will result in $\rho_{AB}$ being with Bob, and
$|\psi\rangle_{ABR}$ should be virtually unchanged, while entanglement is consumed by the
protocol at rate $S(A|B)$.

The quantity $S(A|B)$ is the quantum conditional entropy, and it can be negative [5, 6, 7].
This seemingly odd fact now has a natural interpretation [3] - the conditional entropy
quantifies how many qubits need to be sent from Alice to Bob, and if it is negative, they
gain the potential to send qubits in the future at no cost. That is, Alice can not only send
her state to Bob, but the parties are additionally left with maximally entangled states
which can later be used in a teleportation protocol to transmit quantum states without the
use of a quantum channel. This is the operational meaning of the fact that partial
information can be negative in the quantum world.

A classical model. In order to further understand 
the notion of negative information, we are interested in 
finding some classical analogue of it. Indeed we will find 
a paradigm in which not only is there a notion of neg- 
ative information, but also the rate formulas and proof 
techniques are remarkably similar. We shall take as our 
starting point the similarity between entanglement and 
private correlations, a fact that was used in constructing
the first entanglement distillation protocols, was used
to conjecture new types of classical distributions [9], but
which was first made fully explicit by Collins and Popescu
[10]. In this paradigm, maximally entangled states are
replaced by perfect secret correlations (a "key") with
probability distribution $\Psi_{XY}(0,0) = \Psi_{XY}(1,1) = 1/2$. By
secret, we mean that a third party, an eavesdropper Eve, 
is uncorrelated with Alice and Bob's secret bit. We then 
replace the notion of classical communication by pub- 
lic communication (i.e., the eavesdropper gets a copy of 
the public messages that Alice and Bob send to each
other). Quantum communication (the sending of coherent
quantum states) is replaced by secret communication,
i.e. communication through a secure channel such that 
the eavesdropper learns nothing about what is sent. We 
thus have sets of states (i.e. classical distributions be- 
tween various parties and an eavesdropper), and a class of 
operations - local operations and public communication 
(LOPC). Under LOPC one cannot increase secrecy, just 
as under local operations and classical communication 
(LOCC) one cannot increase entanglement. The anal- 
ogy has the essential feature, as in entanglement theory, 
that there is a resource (secret key, pure entanglement) 
which allows for the transfer of information (private dis- 
tributions, quantum states), and this information can be 
manipulated (by means of classical or public informa- 
tion), and transformed into the resource. This allows for 
the possibility of negative information. We will further 
be able to make new statements about the analogy. For 
example, we will find indications for an analogue of pure
states, mixed states, and various types of GHZ states [11].

Looking at the quantum model, we should consider an arbitrary distributed source between
Alice and Bob, described by a pair of random variables with probability distribution
$P_{XY}$; furthermore we need a "purification", that is an extension of this distribution to
a distribution $P_{XYZ}$ with $Z$ being held by a party $R$, which we call the reference (who
has the marginal distribution $P_Z$). According to this and [13], the natural approach will
be as follows. A pure quantum state held between two parties has a Schmidt decomposition
$|\psi\rangle_{TR} = \sum_i \sqrt{p(i)}\,|e_i\rangle \otimes |f_i\rangle$ with orthonormal
bases $\{|e_i\rangle\}$ and $\{|f_i\rangle\}$. An analogue of this is a private bi-disjoint
distribution, i.e. a distribution $P_{TZ}$ (where $T = XY$),

$P_{TZ}(tz) = \sum_i p(i)\, P_{T|I=i}(t)\, P_{Z|I=i}(z)$,    (1)

with conditional distributions $P_{Z|I}$ and $P_{T|I}$, such that
$P_{T|I=i}(t)\,P_{T|I=j}(t) = 0$ and $P_{Z|I=i}(z)\,P_{Z|I=j}(z) = 0$ for $i \neq j$. Just as
the quantum system $TR$ is in a product state between $T$ and $R$ once $i$ is known, so the
bi-disjoint distribution is in product form
$P_{TZ|I=i}(tz) = P_{T|I=i}(t)\,P_{Z|I=i}(z)$ once $i$ is known. And just as a pure quantum
state is decoupled from any environment, so our distribution should be decoupled from the
eavesdropper.
Note that it appears necessary here to introduce a fourth party $E$, something we could
avoid in the quantum setting by demanding that the overall pure state is preserved - for
distributions the meaning of this is staying decoupled from the eavesdropper, which we have
to distinguish from the reference [12]. Introducing the eavesdropper into the notation, we
have $P_{XYZE} = P_{XYZ} \otimes P_E$. Such distributions we call private, meaning that $E$
is decoupled. In that regard, we shall speak of secret distributions (between Alice and Bob)
where they are decoupled from $R$ and $E$ - following terminology introduced in [13]. We
will provide further justification for the appropriateness of this analogue of pure states
after we have fully analysed merging and negative information. Note however, that it has the
following desired property: in the quantum case, considering a purification of the $AB$
system allows us to enforce the requirement that the protocol succeed for particular pure
state decompositions of $\rho_{AB}$. Likewise the distribution $P_{XYZ}$ allows us to enforce
the requirement that the protocol succeed for a decomposition of the distribution $P_{XY}$,
with the record being held by $R$.
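The bi-disjointness condition of Eq. (1) can be tested mechanically. The sketch below is one possible check, not a construction from the paper: it takes the finest candidate label $I$ to be the connected component of the support graph linking each $t$ to each $z$ with $P_{TZ}(tz) > 0$, and then verifies that the distribution factorises inside every component. The function name `is_bi_disjoint` and both test distributions are assumptions made for illustration.

```python
from collections import defaultdict


def is_bi_disjoint(p_tz, tol=1e-9):
    """Return True if {(t, z): prob} is a mixture of products with disjoint supports."""
    support = [(t, z) for (t, z), p in p_tz.items() if p > tol]

    # Union-find over the nodes ("T", t) and ("Z", z) of the support graph.
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for t, z in support:
        parent[find(("T", t))] = find(("Z", z))

    # Group the support by connected component; each component is a candidate label i.
    blocks = defaultdict(dict)
    for t, z in support:
        blocks[find(("T", t))][(t, z)] = p_tz[(t, z)]

    for block in blocks.values():
        weight = sum(block.values())            # p(i)
        p_t, p_z = defaultdict(float), defaultdict(float)
        for (t, z), p in block.items():
            p_t[t] += p
            p_z[z] += p
        # Inside a block we need P(t, z) = p(i) * P(t|i) * P(z|i).
        for t in p_t:
            for z in p_z:
                if abs(block.get((t, z), 0.0) - p_t[t] * p_z[z] / weight) > tol:
                    return False
    return True


# T = (x, y): perfectly correlated bits when z = 0, anti-correlated bits when z = 1.
parity = {((0, 0), 0): 0.25, ((1, 1), 0): 0.25, ((0, 1), 1): 0.25, ((1, 0), 1): 0.25}
# A noisy correlation between a bit t and a bit z: one block, but not a product.
noisy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

print(is_bi_disjoint(parity), is_bi_disjoint(noisy))
```

Using connected components is a design choice: any valid label $I$ must be constant on a component, and a product block has rectangular (hence connected) support, so the components are the only labelling that needs to be tried.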

We now introduce the analogue of quantum state merging - distribution merging - which
naturally means that at the end Bob and the reference should possess a sample $XYZ$ from the
distribution $P_{XYZ}$, with $Z$ held by the reference and $XY$ by Bob. The protocol may use
public communication freely; we will consider only the rate of secret key used or created.
We also go to many copies of the random variables - thus we denote by $X^n$ many independent
copies of random variable $X$, while $\tilde{X}^n$ denotes the output sample of length $n$.
Formally:

Definition 1 Given $n$ instances of a private bi-disjoint distribution $P_{XYZ}$ between
$AB$ and $R$, a distribution merging protocol between a sender who holds $X$ and receiver
who holds $Y$, is one which creates, by possibly using $k$ secret key bits and free public
communication, a distribution $P'_{\tilde{X}^n \tilde{Y}^n Z^n E\,UV}$ such that $P'$
approximates $P^{\otimes n}_{XYZE} \otimes \Psi^{\otimes l}_{UV}$ for large $n$ (in total
variational, or $\ell_1$, distance). Here $l$ is the number of secret bits shared at the end
between Alice and Bob; Alice has $U$ and Bob has $V\tilde{X}^n\tilde{Y}^n$.

The rate of consumption of secret key for the protocol, called its secret key rate, is
defined to be $\frac{1}{n}(k - l)$.

We can now state our main result: 

Theorem 2 A secret key rate of

$I(X:Z) - I(X:Y) = H(X|Y) - H(X|Z)$    (2)

bits is necessary and sufficient to achieve distribution merging. Here,
$I(X:Y) := H(X) + H(Y) - H(XY)$ is the mutual information. When this quantity is
nonnegative, it is the minimum rate of secret key consumed by an optimal merging protocol.
When it is negative, not only is distribution merging achieved, but $I(X:Y) - I(X:Z)$ bits
of secret key remain at the end of the protocol.
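The equality of the two expressions for the rate is a purely entropic identity, so it can be checked numerically on any joint distribution. The following sketch does this for a randomly generated one; the helper names `H` and `mutual_information`, the alphabet sizes and the random seed are assumptions of the example, and the test distribution is not meant to be bi-disjoint.

```python
import random
from collections import defaultdict
from math import log2


def H(joint, coords):
    """Entropy (bits) of the marginal of `joint` on the coordinate list `coords`."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in coords)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)


def mutual_information(joint, a, b):
    """I(A:B) = H(A) + H(B) - H(AB) for coordinate lists a and b."""
    return H(joint, a) + H(joint, b) - H(joint, a + b)


X, Y, Z = [0], [1], [2]

# Arbitrary joint distribution on bits x, y and a ternary z (illustration only).
random.seed(0)
weights = {(x, y, z): random.random() for x in range(2) for y in range(2) for z in range(3)}
total = sum(weights.values())
p_xyz = {outcome: w / total for outcome, w in weights.items()}

lhs = mutual_information(p_xyz, X, Z) - mutual_information(p_xyz, X, Y)
rhs = (H(p_xyz, X + Y) - H(p_xyz, Y)) - (H(p_xyz, X + Z) - H(p_xyz, Z))

print(abs(lhs - rhs) < 1e-9)
```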
Before proving this theorem, and introducing the protocol in full generality, it may be
useful to discuss three very simple examples:

1. Alice's bit is independent of Bob's bit, but correlated with Eve:
$P_{XYZ}(0,0,0) = P_{XYZ}(1,0,1) = 1/2$. In this case, Alice must send her bit to Bob through
a secret channel, consuming one bit of secret key.

2. Alice and Bob have a perfect bit of shared secret 
correlation: Bob can locally create a random pair 
of correlated bits, and Alice and Bob keep the bit of 
secret correlation as secret key (which they may use 
in the future for private communication). There is 
one bit of negative information. 

3. The distribution $P_{XYZ}(0,1,1) = P_{XYZ}(0,0,0) = P_{XYZ}(1,0,1) = P_{XYZ}(1,1,0) = 1/4$:
If $Z = 0$ Alice and Bob are perfectly correlated, and if $Z = 1$ they are anti-correlated.
In such a case, Alice can tell Bob her bit publicly, and because an eavesdropper doesn't know
Bob's bit, she would not be able to know the value of $Z$. Bob will however know $Z$ and can
locally create a random pair of anti-correlated bits or correlated bits depending on the
value of $Z$. Thus, the distribution merging is achieved with one bit of public
communication and no private communication. This reminds one of the state merging problem
for the quantum state $\rho_{AB} = \frac{1}{2}(|00\rangle\langle 00| + |11\rangle\langle 11|)$,
whose purification on $R$ is the GHZ state, where the merging is achieved with one bit of
classical communication and no quantum communication. Another potential classical analogue
of the GHZ state is the distribution $P_{XYZ}(1,1,1) = P_{XYZ}(0,0,0) = 1/2$ [13], which has
perfect correlations for all sites as for the GHZ state; it also has a merging cost of zero
(although with zero classical communication, unlike in the quantum case). A distribution
which has both of the above features of the GHZ state is the distribution with an equal
mixture of $\{111, 122, 212, 221, 333, 344, 434, 443\}$ inspired by [15]. It has perfect
correlations (1 or 2 on one site is correlated with 1 or 2 on the others, and likewise for 3
and 4), as well as the ability of one of the parties to create secret key by informing the
other parties of her variable. Like the first GHZ-like candidate, it also has no secret
communication cost for distribution merging, and a public communication cost of one bit,
reminiscent of the quantum GHZ state.
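The rate of Theorem 2 can be evaluated directly for the three examples. The sketch below does so; the function name `merging_rate` is an assumption of this example, and for the second example the text does not write out the distribution, so a trivial reference ($Z$ always 0) is assumed here.

```python
from collections import defaultdict
from math import log2


def H(joint, coords):
    """Entropy (bits) of the marginal of `joint` on the coordinate list `coords`."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in coords)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)


def merging_rate(p_xyz):
    """Secret key rate I(X:Z) - I(X:Y) for a distribution over (x, y, z) tuples."""
    i_xz = H(p_xyz, [0]) + H(p_xyz, [2]) - H(p_xyz, [0, 2])
    i_xy = H(p_xyz, [0]) + H(p_xyz, [1]) - H(p_xyz, [0, 1])
    return i_xz - i_xy


examples = {
    "1: bit known to the reference, unknown to Bob": {(0, 0, 0): 0.5, (1, 0, 1): 0.5},
    # Assumption: the reference holds a constant, so Alice and Bob share a perfect key bit.
    "2: perfect shared secret bit": {(0, 0, 0): 0.5, (1, 1, 0): 0.5},
    "3: correlated if Z=0, anti-correlated if Z=1": {
        (0, 0, 0): 0.25, (1, 1, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25,
    },
}

rates = {name: merging_rate(p) for name, p in examples.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:+.3f} secret bits per copy")
```

The three values come out as one bit consumed, one bit gained, and zero, matching the discussion of the examples.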

Proof of Theorem 2. We now describe the general protocol for distribution merging. We will
give two proofs of achievability: the first is very simple and uses recycling of the initial
secret key resources. Namely, let Alice make her transmission of Slepian-Wolf coding [2]
secret, using a rate of $H(X|Y)$ secret bits. This gives Bob knowledge of $XY$, which by the
bi-disjointness of $P_{XYZ}$ informs him of $Z$ [rather, the label $I$ in (1)]. Hence he can
produce a fresh sample $\tilde{X}\tilde{Y}$ of the conditional distribution $P_{XY|Z}$ - this
solves the merging part. Now observe only that Alice and Bob are still left with the shared
$X$; from it they can extract $H(X|Z)$ secret bits via privacy amplification [16], i.e.
random hashing. By repeatedly running this protocol, we can recover the startup cost of
providing $H(X|Y)$ secret bits, which is only later recycled - at least if the rate (2) is
positive. In the appendix we show a direct proof in one step, which produces secret key if
(2) is negative without the need to provide some to start the process.

Now we turn to the converse, namely that this protocol is optimal. Just as in state merging,
the proof comes from looking at monotones. Assuming first that secret key is consumed in the
protocol, then the initial amount of secrecy that Bob has with Alice and the reference $R$
is $H(K) + I(Y:XZ)$, where $K$ is a random variable describing the key. By monotonicity of
secrecy under local operations and public communication this must be greater than the final
amount of secrecy he has with them; but since he then has $\tilde{X}\tilde{Y}$, this is
$I(\tilde{X}\tilde{Y}:Z) = I(XY:Z)$. Hence
$H(K) \geq I(XY:Z) - I(XZ:Y) = I(X:Z) - I(X:Y)$ as required. If key is acquired in the
protocol, then the value $H(K)$ should be put as part of the final amount of secrecy, and we
have again $H(K) \leq I(X:Y) - I(X:Z)$. □
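The last equality in the converse is not spelled out in the text. A short derivation, using only the definition $I(A:B) = H(A) + H(B) - H(AB)$ and written here as a sketch rather than as the authors' own argument, is:

```latex
\begin{align*}
I(XY:Z) - I(XZ:Y)
  &= \bigl[H(XY) + H(Z) - H(XYZ)\bigr] - \bigl[H(XZ) + H(Y) - H(XYZ)\bigr] \\
  &= \bigl[H(XY) - H(Y)\bigr] - \bigl[H(XZ) - H(Z)\bigr] \\
  &= H(X|Y) - H(X|Z) \\
  &= \bigl[H(X) - I(X:Y)\bigr] - \bigl[H(X) - I(X:Z)\bigr] \\
  &= I(X:Z) - I(X:Y).
\end{align*}
```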

The cost of distribution merging might appear quite different to the cost of quantum state
merging. Actually this is not the case. Since $|\psi\rangle_{ABR}$ is pure, we may rewrite

$S(A|B) = \frac{1}{2}\,[I(A:R) - I(A:B)]$,    (3)

in terms of the quantum mutual information $I(A:B) := S(A) + S(B) - S(AB)$. This looks like
the cost of distribution merging, only with a mysterious factor of 1/2. The factor is the
same one that accounts for the fact that while one bit of secret key has $I(A:B) = 1$ and
can be used in a one-time pad protocol for one bit of secret communication, a singlet has
$I(A:B) = 2$ but can teleport only one qubit. For an alternative explanation, see also [17].
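Eq. (3) follows from the fact that, for a pure state on $ABR$, complementary subsystems have equal entropy: $S(R) = S(AB)$ and $S(AR) = S(B)$. A worked version of this step, given here as a sketch under exactly that assumption, reads:

```latex
\begin{align*}
I(A:R) - I(A:B)
  &= \bigl[S(A) + S(R) - S(AR)\bigr] - \bigl[S(A) + S(B) - S(AB)\bigr] \\
  &= \bigl[S(A) + S(AB) - S(B)\bigr] - \bigl[S(A) + S(B) - S(AB)\bigr] \\
  &= 2\,\bigl[S(AB) - S(B)\bigr] \\
  &= 2\,S(A|B).
\end{align*}
```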

Pure and mixed state analogues. Note that a crucial part of the merging protocol is that
once Bob knows Alice's variable, he effectively knows $Z$ and can thus recreate the
distribution (more precisely, he knows the product distribution he shares with $R$).
Recreating the distribution would not be as easy if the total distribution $P_{XYZ}$ were
not bi-disjoint, which further serves to motivate our definition of bi-disjoint
distributions as the analogues of pure quantum states (although only for this particular
merging task). Nevertheless, one might wonder if we have not overly restricted our model.
Let us go back to a general distribution $P_{XYZ}$ of Alice, Bob and the reference, and
observe that it can always be written

$P_{XYZ} = (\mathrm{id}_{XY} \otimes \Lambda)\, P_{XYZ'}$    (4)

with $\mathrm{id}_{XY}$ the identity, $P_{XYZ'}$ a bi-disjoint distribution, and a noisy
channel (a stochastic map) $\Lambda : Z' \to Z$. Up to relabelling of $Z'$ there is in fact
a unique minimal distribution, denoted $P_{XY\overline{Z}}$, in the sense that every other
$P_{XYZ'}$ can be degraded to $P_{XY\overline{Z}}$ by locally applying a (deterministic)
channel $\Lambda' : Z' \to \overline{Z}$. One way of doing this is by having $\overline{Z}$
be a record of which probability distribution needs to be created, conditional on each $XY$.
A channel can then act on the record $\overline{Z}$ to create the needed probability
distribution $P_{Z|XY}$. I.e. we define (cf. [18])

$\overline{z} = \Phi(xy) := P_{Z|XY=xy}$,

as an element of the probability simplex - this means that pairs $xy$ are labelled by the
same $\overline{z}$ (which is a deterministic function $\Phi$ of $xy$) if and only if the
conditional distributions $P_{Z|XY=xy}$ are the same. The channel $\Lambda$ has the
transition probabilities $\Lambda(z|\overline{z}) = \overline{z}(z) = P_{Z=z|XY=xy}$. Note
that $P_{XY\overline{Z}}$ is indeed bi-disjoint. Let us call this $P_{XY\overline{Z}}$ the
purified version of $P_{XYZ}$. Note the beautiful analogy to the quantum case, where every
mixed state $\rho_{ABR}$ on $ABR$ can be written

$\rho_{ABR} = (\mathrm{id}_{AB} \otimes \Lambda)\, \psi_{ABR'}$

with a quantum channel $\Lambda : R' \to R$ and an essentially unique pure state
$\psi_{ABR'}$ (up to local unitaries).
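The construction of the purified version can be sketched in a few lines: label every pair $xy$ by its conditional distribution of $Z$, and keep that conditional distribution as the channel back to $Z$. In the sketch below the function name `purify`, the rounding used to compare conditional distributions numerically, and the noisy test distribution are all assumptions made for illustration.

```python
from collections import defaultdict


def purify(p_xyz, digits=9):
    """Return (P_{XY Zbar}, channel) where zbar = Phi(xy) is the conditional law of Z.

    Conditional distributions are rounded to `digits` decimals so that equal
    distributions produce the same label despite floating-point error.
    """
    p_xy = defaultdict(float)
    for (x, y, _), p in p_xyz.items():
        p_xy[(x, y)] += p
    z_values = sorted({z for (_, _, z) in p_xyz})

    def phi(x, y):
        # The label is the whole conditional distribution P_{Z|XY=xy}, as a tuple.
        return tuple(round(p_xyz.get((x, y, z), 0.0) / p_xy[(x, y)], digits) for z in z_values)

    purified = {(x, y, phi(x, y)): p for (x, y), p in p_xy.items() if p > 0}
    # The channel maps a label zbar back to a distribution over the original z values.
    channel = {zbar: dict(zip(z_values, zbar)) for (_, _, zbar) in purified}
    return purified, channel


# Hypothetical noisy reference: Z equals the parity of x and y with probability 0.9.
p_noisy = {
    (0, 0, 0): 0.225, (0, 0, 1): 0.025, (1, 1, 0): 0.225, (1, 1, 1): 0.025,
    (0, 1, 0): 0.025, (0, 1, 1): 0.225, (1, 0, 0): 0.025, (1, 0, 1): 0.225,
}

p_purified, channel = purify(p_noisy)
labels = {zbar for (_, _, zbar) in p_purified}
print(sorted(labels))
```

For this input the four pairs $xy$ collapse onto two labels, one for each parity, and the channel reintroduces the 10% noise on the reference.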

Theorem 3 For general $P_{XYZ}$, the optimal rate of distribution merging is that of the
purified version $P_{XY\overline{Z}}$, i.e.

$I(X:\overline{Z}) - I(X:Y) = H(X|Y) - H(X|\overline{Z})$.    (5)

Clearly, it is achievable: we have a protocol at this rate for $P_{XY\overline{Z}}$, which
must work for $P_{XYZ}$ as well, since the latter is obtained by locally degrading
$\overline{Z} \to Z$, which commutes with the merging protocol acting only on Alice and Bob
and makes the secrecy condition for the final key only easier to satisfy.

To show that the rate (5) is optimal, we shall argue that successful merging with reference
$Z$ implies that the protocol is actually successful for reference $\overline{Z}$, at which
point we can use the previous converse for "pure" (bi-disjoint) distributions. Observe that
Bob at the end of the protocol has to produce samples $\tilde{X}^n\tilde{Y}^n$ such that
$P_{\tilde{X}^n\tilde{Y}^n Z^n} \approx P_{X^n Y^n Z^n}$. Assume now that it were true that
with high probability (over the joint distribution of
$X^n Y^n Z^n \tilde{X}^n \tilde{Y}^n$),



This in fact implies that merging is achieved for the distribution $P_{XY\overline{Z}}$,

and both final terms are small. Furthermore, the secret key (possibly) distilled at the end
of the protocol has to be uncorrelated to $\tilde{X}^n\tilde{Y}^n$, and since this data
includes knowledge of $\overline{Z}$, the key will not only be secret from a reference but
even against $\overline{Z}$.

Now, unfortunately we cannot argue (6) for a given protocol (and insofar as the situation is
understood, it may not even be generally true [19]); however, we can modify the protocol
slightly - in particular losing only a sublinear number of key bits - such that (6) becomes
true. We invoke a result on so-called "blind mixed-state compression" [20, 21] (see also
[22]): notice that Bob has to output (for most $Z$) a sample of the conditional distribution
$P_{XY|Z}$, but that Alice and Bob together have access only to one sample of that
distribution, without knowing $Z$. The central technical result in [23] is that every such
process must preserve a lot of correlation between the given and the produced sample, in the
sense that $\Pr\{\Phi(X_I Y_I) \neq \Phi(\tilde{X}_I \tilde{Y}_I)\}$, with random index $I$,
is small. In other words, with high probability, the string
$\Phi^n(\tilde{X}^n\tilde{Y}^n)$ is within a small Hamming ball around
$\overline{Z}^n = \Phi^n(X^n Y^n)$. Since Bob knows $Y^n$ already, Alice will need to send
only negligible further information about $X^n$ to Bob (invoking Slepian-Wolf another time)
so that he can determine the correct $\overline{Z}^n$ with high probability. On the other
hand, privacy amplification incurs only a negligible loss in rate to make the final secret
key independent of this further communication (namely just its length), and hence of
$\overline{Z}$. Hence, we have a protocol that effectively puts Bob in possession of
$\overline{Z}$, of which the final secret key is independent; hence he could just output a
sample from $P_{XY|\overline{Z}}$, which would yield a valid and asymptotically correct
protocol.

The expression in Eq. (5), when negative and optimised over pre-processing, was previously
shown to be the rate for secret key generation [23, 13, 24]. Here, as in the quantum case,
we find that distribution merging provides an interpretation of this quantity without
looking at optimisations, and for both the positive and negative case.

Note that for given $P_{XYZ}$, if $Q_{XYZ'} = (\mathrm{id}_{XY} \otimes \Lambda)\,P_{XYZ}$
with $\Lambda$ sufficiently close to the identity, the two distributions have the same
purification, leading to the conclusion that our result on distribution merging is robust
under small perturbations of the reference. Note however that a general perturbation of
$P_{XYZ}$ by an arbitrarily small change in the probability density leads to a drastic
discontinuity: namely, a generic perturbation $Q_{X'Y'Z'}$ will have trivial purification
$\overline{Z} = X'Y'$ because all conditional distributions $Q_{Z'|X'Y'}$ will be different.
Thus, for $Q$ the merging cost will be $H(X'|Y')$ - essentially Slepian-Wolf coding with Bob
outputting the very $X'Y'$ of the source, so Alice and Bob's common knowledge of $X'$ cannot
be turned into secret key. However, this is consistent with the extreme case of
$P_{XYZ} = P_{XY} \otimes P_Z$, which has merging cost $-I(X:Y)$ since Bob can locally
produce a fresh sample from $P_{XY}$, and he can extract $I(X:Y)$ secret bits from the
correlation $XY$ with Alice.

Distribution exchange. We now turn to finding an analogue of quantum state exchange. In the
quantum task, not only does Alice send her state to Bob, but Bob should additionally send
his state to Alice, which is to say that the final state is just the initial state with
Alice and Bob's shares permuted. Amazingly, this can require less resources than if only
Alice is required to send to Bob. In general, the number of qubits that need to be exchanged
can be said to quantify the uncommon quantum information between Alice and Bob, because this
is the part which has to be sent to their partner. We can consider the analogy of this,
where Alice and Bob must exchange distributions. This minimal rate of secret key clearly
must be non-negative, since Alice and Bob could otherwise continue swapping their
distribution and create unlimited secret key from some given correlation and LOPC. Note that
the rate zero is indeed possible. The distribution
$P_{XYZ}(0,0,0) = P_{XYZ}(1,1,1) = P_{XYZ}(0,1,2) = P_{XYZ}(1,0,2) = 1/4$, for instance, has
the property that exchanging the distribution has zero exchange cost (because it is
symmetric), while the cost of Alice merging her distribution to Bob's is $I(X:Z) = 1/2$.
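Both claims about this example distribution can be checked numerically: it is invariant under swapping $X$ and $Y$, and since $X$ and $Y$ are independent here the merging cost $I(X:Z) - I(X:Y)$ reduces to $I(X:Z)$. The helper name `H` below is an assumption of this sketch.

```python
from collections import defaultdict
from math import log2


def H(joint, coords):
    """Entropy (bits) of the marginal of `joint` on the coordinate list `coords`."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in coords)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)


# The distribution from the text, as {(x, y, z): probability}.
p_xyz = {(0, 0, 0): 0.25, (1, 1, 1): 0.25, (0, 1, 2): 0.25, (1, 0, 2): 0.25}

# Exchanging the two parties' variables gives back the same distribution.
swapped = {(y, x, z): p for (x, y, z), p in p_xyz.items()}
is_symmetric = swapped == p_xyz

i_xz = H(p_xyz, [0]) + H(p_xyz, [2]) - H(p_xyz, [0, 2])
i_xy = H(p_xyz, [0]) + H(p_xyz, [1]) - H(p_xyz, [0, 1])
merging_cost = i_xz - i_xy

print(is_symmetric, i_xy, merging_cost)
```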

In [25], a lower bound for quantum state exchange given in terms of one-way entanglement
distillation between $R$ and each of the parties was proven. A similar lower bound
$K_{\to}(Z\rangle X) + K_{\to}(Z\rangle Y)$, where $K_{\to}(Z\rangle T)$ is the distillable
key (using only one-way communication from $R$), can be proven in the context of
distribution exchange. For upper bounds, one can introduce protocols: for example,
Slepian-Wolf coding in either direction is also possible, costing $H(X|Y) + H(Y|X)$. A more
sophisticated protocol that is sometimes better uses results from [23]: the rate
$I(X:Z) - I(X:Y) + I(XY:W)$ can be achieved (or the same quantity with $X$ and $Y$
interchanged, whichever is smaller); this quantity is minimized over distributions $W$ such
that $X - W - Y$ is a Markov chain. The protocol is for Alice to merge her $X$ to Bob, which
consumes $I(X:Z) - I(X:Y)$ secret bits; then Bob locally creates not $XY|Z$ as with merging,
but rather $W|Z$, and then $W$ is essentially communicated back to Alice - but by [13] only
a rate $I(XY:W)$ needs to be sent. Then, based on $W$, each one creates a sample $X$ and
$Y$, respectively.



An interesting aspect of quantum state exchange is that the rate given by the sum of both
parties' minimal rate of state merging, $S(A|B) + S(B|A)$, is usually not attainable
(although as noted above, one can sometimes beat it). This is because if Alice first merges
her state with Bob, Bob will not be able to merge his state with Alice, but must send at the
full rate $S(B)$. This is because after Alice merges, she is left with nothing, being unable
to clone a copy of her state. This motivates us to consider the analogue of cloning,
especially since naively, classical variables can be copied. However, we need a different
kind of copying to enable Alice and Bob to merge their distributions simultaneously: it
would be for Alice to create a fresh, independent sample from the conditional distribution
$P_{X|YZ}$ of her $X$, given $Y$ and $Z$ (which are unknown to her). If she could do that,
she would be able to merge her first sample to Bob at secret key cost $I(X:Z) - I(X:Y)$, and
then he could merge his $Y$ to her second sample (which we designed to have the same joint
distribution with $YZ$), at cost $I(Y:Z) - I(X:Y)$. Since we know that the sum

$I(X:Z) + I(Y:Z) - 2\,I(X:Y) = H(X|Y) + H(Y|X) - H(X|Z) - H(Y|Z)$

is not in general an achievable rate, this hypothetical cloning cannot be always possible.
Such cloning is indeed always impossible, unless the various conditional distributions
$P_{X|YZ}$ are either identical or have disjoint support. Note that in this case, $P_{XYZ}$
is bi-disjoint for the cut $X$-$YZ$. A different viewpoint is that the cloning would
increase the (secret) correlation between Alice and Bob, which of course cannot be unless
they can privately communicate; this seems to be another way of thinking about a classical
analogue of the no-cloning principle [29].

Conclusion. In this paper, we have described a 
classical analogue of negative quantum information, and 
we find that the similarities between quantum informa- 
tion theory and privacy theory extend very far in this 
analogy (at least in the present context), including no- 
cloning, pure and mixed states, and GHZ-type correla- 
tions. Quantum state merging (with reference systems 
such that the overall state is pure or mixed) and state ex- 
change lead to similar protocols in the case of private dis- 
tributions which have many properties in common with 
their quantum counterparts. This is part of a body of 
work exploring the similarities between entanglement and 
classical correlations, which, it is hoped, will stimulate 
progress in both fields, for instance, on the question of
the possible existence of bound information [9].
APPENDIX Direct proof of Theorem 2. For the second, direct, proof of achievability, we will
need the sampling lemma, which is proved in [27] (see also [s^ and

My- 

Lemma 4 Consider a distribution $P_{UV}$ of random variables $U$ and $V$ (with marginals
$P_V$ and $P_U$), and $n$ independent samples $U^n V^n = U_1 V_1, \ldots, U_n V_n$ from this
distribution. Then for every $\gamma > 0$ and sufficiently large $n$, there are
$N \leq 2^{n(I(U:V)+\gamma)}$ sequences $u^n_{(j)}$ from $\mathcal{U}^n$ such that, with



$Q = \frac{1}{N} \sum_{j=1}^{N} P_{V^n|U^n = u^n_{(j)}}$,    (7)

$D(Q \,\|\, P_V^{\otimes n}) \leq 2^{\ldots}$.    (8)



Here, $D$ denotes the relative entropy. Furthermore, such a family of sequences is found
with high probability by selecting them independently at random with probability
distribution $P_U^{\otimes n}$.

In such a situation we say that the distribution of $V^n$, $P_{V^n}$, is covered by the $N$
sequences, meaning that the distribution $P_{V^n}$ is approximated with high accuracy by
choosing only slightly more than $2^{nI(U:V)}$ sequences from $\mathcal{U}^n$.
We achieve distribution merging using a protocol extremely reminiscent of state merging. In
state merging, one adds a maximally entangled state of dimension $nS(A|B)$ bits, and then
performs a random measurement on $\rho_A$ and the pure entanglement, the result of which is
communicated to Bob. Here, Alice and Bob add a secret key of size $H(K)$, and the analogue
of a random measurement will be a random hash (described below), the result of which is
communicated to Bob. In state merging, a faithful protocol has the property that $\rho_R$ is
unchanged and Bob can decode his state to $\rho_A$ after learning Alice's measurement. Here,
a successful protocol is likewise one which allows Bob to learn $X$, while the distribution
of $R$ is unchanged if one conditions on the result of Alice's measurement.

Let us first take the case when $I(X:Z) - I(X:Y)$ is negative. Alice and Bob previously
decide on a random binning, or code, which groups Alice's $2^{nH(X)}$ sequences into
$2^{nH(X|Y)}$ sets of size just under $2^{nI(X:Y)}$. Each of these sets is numbered by $C_0$
and is called the outer code. Within each set, we further divide the sequences into
$2^{n[I(X:Y)-I(X:Z)]}$ sets containing just over $2^{nI(X:Z)}$ sequences. These smaller sets
are labeled by $C_1$, the inner code. Alice then publicly broadcasts the number $C_0$ of the
outer code that her sequence is in (this takes $nH(X|Y)$ bits of public communication to
Bob). Now, based on learning $C_0$, Bob will know $X^n$ by the Slepian-Wolf theorem. We say
that he can decode Alice's sequence. Because the distribution $P_{XYZ}$ is bi-disjoint, and
Bob knows $X^n$ and $Y^n$, he must know $Z^n$. He can now create the distribution
$P_{\tilde{X}\tilde{Y}|Z=z} = P_{XY|Z=z}$. He has thus succeeded in obtaining
$\tilde{X}\tilde{Y}$ such that the overall distribution is close to $P_{XYZ}$. Furthermore,
the distribution is private - each set (or code) in $C_0$ has more than $2^{nI(X:Z)}$
elements (i.e. codewords) [recall that there are $2^{nI(X:Y)}$ outer codewords, and
$I(X:Y) > I(X:Z)$]. The sampling lemma then tells us that $R$'s distribution is unchanged,
i.e. $P_{Z^n|C_0=c} \approx P_{Z^n}$, which means that an eavesdropper who learns which code
$C_0$ Alice's sequence is in, doesn't learn anything about the sequence that $R$ has.

Next, we see that Alice and Bob gain $n[I(X:Y) - I(X:Z)]$ bits of secret key. Since Alice
and Bob both know $X^n$, they both know which inner code $C_1$ it lies in, and this they use
as the key. There are $2^{n[I(X:Y)-I(X:Z)]}$ of them, and each contains just over
$2^{nI(X:Z)}$ codewords. Thus, from the covering lemma, $R$'s state is independent of its
value, thus she (and consequently any eavesdropper) has arbitrarily small probability of
knowing its value.

Now, in the case where $I(X:Z) - I(X:Y)$ is positive, Alice and Bob simply use
$I(X:Z) - I(X:Y)$ bits of secret key. Since each bit of key decreases $I(X:Z) - I(X:Y)$ by
1, they need this amount of key until the quantity $I(X:Z) - I(X:Y)$ is negative, and then
the preceding proof applies. We thus see that $I(X:Z) - I(X:Y)$ bits of key are required to
perform distribution merging, and if it is negative, one can achieve distribution merging,
while obtaining this amount of key. □
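The bookkeeping of the nested code in the negative-rate case can be made concrete by computing the exponents per copy: $H(X)$ splits into $H(X|Y)$ (the outer code index, sent publicly), $I(X:Y) - I(X:Z)$ (the inner code index, kept as key) and $I(X:Z)$ (the position inside an inner code). The sketch below evaluates these for a toy bi-disjoint distribution; the function name `code_exponents` and the example distribution are assumptions made for illustration.

```python
from collections import defaultdict
from math import log2


def H(joint, coords):
    """Entropy (bits) of the marginal of `joint` on the coordinate list `coords`."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in coords)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)


def code_exponents(p_xyz):
    """Per-copy exponents (bits) of the outer/inner code sizes used in the direct proof."""
    h_x = H(p_xyz, [0])
    i_xy = h_x + H(p_xyz, [1]) - H(p_xyz, [0, 1])
    i_xz = h_x + H(p_xyz, [2]) - H(p_xyz, [0, 2])
    return {
        "typical_sequences": h_x,                 # 2^{n H(X)} sequences in total
        "outer_codes": h_x - i_xy,                # 2^{n H(X|Y)} outer codes, announced publicly
        "inner_codes_per_outer": i_xy - i_xz,     # 2^{n [I(X:Y) - I(X:Z)]}, the secret key
        "inner_code_size": i_xz,                  # 2^{n I(X:Z)}, enough to cover the reference
    }


# Hypothetical example: X uniform on {0,1,2,3}, Bob holds a copy, the reference holds the high bit.
p_xyz = {(x, x, x // 2): 0.25 for x in range(4)}

exponents = code_exponents(p_xyz)
for name, value in exponents.items():
    print(f"{name}: {value:.3f}")
```

In this example Alice needs no public communication at all, and one of her two bits per copy ends up as secret key, in line with a merging rate of $-1$.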